Title: RICE MLH1 ORTHOLOG AND USES THEREOF
Inventor(s): Pramod B. Mahajan

Application No: Not yet assigned

Atty Dkt No: 35718/238971 (5718-142)

Complete Nucleotide and Deduced Amino Acid Sequence of Rice homolog of MLH1 

1 CGGCACGAGATTTTGCAGTCTCCTCTCCTCCTCCGCTCGAGCGAGTGAGTCCCGACCACG 60 

61 TCGCTGCCCTCGCCTCACCGCCGGCCAACCGCCGTGACGAGAGATCGAGCAGGGCGGGGC 120

121 ATGGACGAGCCTTCGCCGCGCGGAGGTGGGTGCGCCGGGGAGCCGCCCCGCATCCGGAGG 180 
MetAspGluProSerProArgGlyGlyGlyCysAlaGlyGluProProArgIleArgArg

181 TTGGAGGAGTCGGTGGTGAACCGCATCGCGGCGGGGGAGGTGATCCAGCGGCCGTCGTCG 240 
LeuGluGluSerValValAsnArgIleAlaAlaGlyGluValIleGlnArgProSerSer

241 GCGGTGAAGGAGCTCATCGAGAACAGCCTCGACGCTGGCGCCTCCAGCGTCTCCGTTGCG 300 
AlaValLysGluLeuIleGluAsnSerLeuAspAlaGlyAlaSerSerValSerValAla 

301 GTGAAGGACGGTGGCCTCAAGCTCATCCAGGTCTCCGATGACGGCCATGGCATCAGGTTT 360 
ValLysAspGlyGlyLeuLysLeuIleGlnValSerAspAspGlyHisGlyIleArgPhe

361 GAGGATTTGGCAATATTGTGCGAAAGGCATACTACCTCAAAGTTATCTGCATACGAGGAT 420 
GluAspLeuAlaIleLeuCysGluArgHisThrThrSerLysLeuSerAlaTyrGluAsp

421 CTGCAGACCATAAAATCGATGGGGTTCAGAGGGGAGGCTTTGGCTAGTATGACTTATGTT 480
LeuGlnThrIleLysSerMetGlyPheArgGlyGluAlaLeuAlaSerMetThrTyrVal

481 GGCCATGTTACCGTGACAACGATAACAGAAGGCCAATTGCACGGCTACAGGGTTTCTTAC 540
GlyHisValThrValThrThrIleThrGluGlyGlnLeuHisGlyTyrArgValSerTyr

541 AGAGATGGTGTAATGGAGAATGAGCCTAAGCCTTGCGCTGCGGTGAAAGGAACTCAAGTC 600 
ArgAspGlyValMetGluAsnGluProLysProCysAlaAlaValLysGlyThrGlnVal 

601 ATGGTTGAAAATCTATTTTACAACATGGTAGCCCGCAAGAAAACATTGCAGAACTCCAAT 660
MetValGluAsnLeuPheTyrAsnMetValAlaArgLysLysThrLeuGlnAsnSerAsn

661 GATGACTACCCCAAGATCGTAGACTTCATCAGTCGGTTTGCAGTCCATCACATCAACGTT 720
AspAspTyrProLysIleValAspPheIleSerArgPheAlaValHisHisIleAsnVal

721 ACCTTCTCTTGCAGAAAGCATGGAGCCAATAGAGCAGATGTTCATAGTGCAAGTACATCC 780 
ThrPheSerCysArgLysHisGlyAlaAsnArgAlaAspValHisSerAlaSerThrSer 

781 TCAAGGTTAGATGCTATCAGGAGTGTCTATGGGGCTTCTGTCGTTCGTGATCTCATAGAA 840
SerArgLeuAspAlaIleArgSerValTyrGlyAlaSerValValArgAspLeuIleGlu

841 ATAAAGGTTTCATATGAGGATGCTGCAGATTCAATCTTCAAGATGGATGGTTACATCTCA 900
IleLysValSerTyrGluAspAlaAlaAspSerIlePheLysMetAspGlyTyrIleSer

901 AATGCAAATTATGTGGCAAAGAAGATTACAATGATTCTTTTCATAAATGATAGGCTTGTA 960
AsnAlaAsnTyrValAlaLysLysIleThrMetIleLeuPheIleAsnAspArgLeuVal

961 GACTGTACTGCTTTGAAAAGAGCTATTGAATTTGTGTACTCTGCAACATTGCCTCAAGCA 1020
AspCysThrAlaLeuLysArgAlaIleGluPheValTyrSerAlaThrLeuProGlnAla

1021 TCCAAACCTTTCATATACATGTCCATACATCTTCCATCAGAACACGTGGATGTTAATATA 1080
SerLysProPheIleTyrMetSerIleHisLeuProSerGluHisValAspValAsnIle

1081 CACCCAACCAAGAAAGAGGTTAGCCTTTTGAATCAAGAGCGTATTATTGAAACAATAAGA 1140
HisProThrLysLysGluValSerLeuLeuAsnGlnGluArgIleIleGluThrIleArg

1141 AATGCTATTGAGGAAAAACTGATGAATTCTAATACAACCAGGATATTCCAAACTCAGGCA 1200 
AsnAlaIleGluGluLysLeuMetAsnSerAsnThrThrArgIlePheGlnThrGlnAla

1201 TTAAACTTATCAGGGATTGCTCAAGCTAACCCACAAAAGGATAAGGTTTCTGAGGCCAGT 1260 
LeuAsnLeuSerGlyIleAlaGlnAlaAsnProGlnLysAspLysValSerGluAlaSer

1261 ATGGGTTCTGGAACAAAATCTCAAAAAATTCCTGTGAGCCAAATGGTCAGAACAGATCCA 1320
MetGlySerGlyThrLysSerGlnLysIleProValSerGlnMetValArgThrAspPro 

1321 CGCAATCCATCTGGAAGATTGCACACCTACTGGCACGGGCAATCTTCAAATCTTGAAAAG 1380 
ArgAsnProSerGlyArgLeuHisThrTyrTrpHisGlyGlnSerSerAsnLeuGluLys 

1381 AAATTTGATCTTGTATCTGTAAGAAATGTTGTAAGATCAAGGAGAAACCAAAAAGATGCT 1440 
LysPheAspLeuValSerValArgAsnValValArgSerArgArgAsnGlnLysAspAla 


1441 GGTGATTTGTCAAGCCGTCATGAGCTCCTTGTGGAAATAGATTCTAGCTTCCATCCTGGC 1500 
GlyAspLeuSerSerArgHisGluLeuLeuValGluIleAspSerSerPheHisProGly 

1501 CTTTTGGACATTGTCAAGAACTGCACATATGTTGGACTTGCCGATGAAGCCTTTGCTTTG 1560 
LeuLeuAspIleValLysAsnCysThrTyrValGlyLeuAlaAspGluAlaPheAlaLeu 

1561 ATACAACACAATACCCGCTTATACCTTGTAAATGTGGTAAATATTAGTAAAGAACTTATG 1620 
IleGlnHisAsnThrArgLeuTyrLeuValAsnValValAsnIleSerLysGluLeuMet



1621 TACCAGCAAGCTTTGTGCCGTTTTGGGAACTTCAATGCTATTCAGCTCAGTGAACCAGCT 1680
TyrGlnGlnAlaLeuCysArgPheGlyAsnPheAsnAlaIleGlnLeuSerGluProAla

1681 CCACTTCAGGAGTTGCTGGTGATGGCACTGAAAGACGATGAATTGATGAGTGATGAAAAG 1740
ProLeuGlnGluLeuLeuValMetAlaLeuLysAspAspGluLeuMetSerAspGluLys 

1741 GATGATGAGAAACTGGAGATTGCAGAAGTAAACACTGAGATACTAAAAGAAAATGCTGAG 1800 
AspAspGluLysLeuGluIleAlaGluValAsnThrGluIleLeuLysGluAsnAlaGlu 

1801 ATGATTAATGAGTACTTTTCTATTCACATTGATCAAGATGGCAAATTGACAAGACTTCCT 1860 
MetIleAsnGluTyrPheSerIleHisIleAspGlnAspGlyLysLeuThrArgLeuPro

1861 GTTGTACTGGACCAGTACACCCCTGATATGGACCGTCTTCCAGAATTTGTGTTGGCTTTA 1920
ValValLeuAspGlnTyrThrProAspMetAspArgLeuProGluPheValLeuAlaLeu 

1921 GGAAATGATGTTACTTGGGATGACGAGAAAGAGTGCTTCAGAACAGTAGCTTCTGCTGTA 1980 
GlyAsnAspValThrTrpAspAspGluLysGluCysPheArgThrValAlaSerAlaVal

1981 GGAAACTTCTATGCACTTCATCCCCCAATCCTTCCAAATCCATCTGGGAATGGCATTCAT 2040 
GlyAsnPheTyrAlaLeuHisProProIleLeuProAsnProSerGlyAsnGlyIleHis

2041 TTATACAAGAAAAATAGAGATTCAATGGCTGATGAACATGCTGAGAATGATCTAATATCA 2100 
LeuTyrLysLysAsnArgAspSerMetAlaAspGluHisAlaGluAsnAspLeuIleSer 


2101 GATGAAAATGACGTTGATCAAGAACTTCTTGCGGAAGCAGAAGCAGCATGGGCCCAACGT 2160 
AspGluAsnAspValAspGlnGluLeuLeuAlaGluAlaGluAlaAlaTrpAlaGlnArg 

2161 GAGTGGACCATTCAGCATGTCTTGTTTCCATCCATGCGACTTTTCCTCAAGCCCCCGAAG 2220
GluTrpThrIleGlnHisValLeuPheProSerMetArgLeuPheLeuLysProProLys

2221 TCAATGGCAACAGATGGAACGTTTGTGCAGGTTGCTTCCTTGGAGAAACTCTACAAGATT 2280
SerMetAlaThrAspGlyThrPheValGlnValAlaSerLeuGluLysLeuTyrLysIle 

2281 TTTGAAAGGTGTTAGCTCATAAGTGAGAAAATGAAGGCAGAGTAAGATCATGATTCATGG 2340
PheGluArgCysEnd


2341 AGTGTTTTTGAAAATGTGTATAATTTCACCGTATTATGTACTTTGATAGTGTCTGTAGAA 2400 


2401 ACTGAAGAAAGAAAGATGGCTTTACTTCTGAATTGAAAGTTAACGATGCCAGCAATTGTA 2460

2461 TATTCTGATCAACCAAAAAAAAAAAAAAAAAAAAAAAAAAA 2501
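
The deduced amino acid sequence shown beneath the cDNA follows from reading the sequence in frame from the ATG at nucleotide 121. As a minimal sketch only, assuming the standard genetic code and using a hypothetical helper named `translate` that is not part of the original disclosure, the first line of the open reading frame (nucleotides 121-180) can be translated as follows:

```python
from itertools import product

# Standard genetic code with codons enumerated in TCAG order.
# Assumption: the standard nuclear code applies to this rice cDNA.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cdna: str) -> str:
    """Translate an in-frame cDNA string, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = CODON_TABLE[cdna[i:i + 3].upper()]
        if aa == "*":  # stop codon ends the deduced protein
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotides 121-180 of the rice cDNA listed above.
orf_start = "ATGGACGAGCCTTCGCCGCGCGGAGGTGGGTGCGCCGGGGAGCCGCCCCGCATCCGGAGG"
print(translate(orf_start))  # MDEPSPRGGGCAGEPPRIRR
```

The printed residues correspond to the first twenty positions of the one-letter amino acid listing that follows.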

Amino Acid Sequence of Rice Homolog of MLH1

  1 MDEPSPRGGG CAGEPPRIRR LEESVVNRIA AGEVIQRPSS AVKELIENSL
 51 DAGASSVSVA VKDGGLKLIQ VSDDGHGIRF EDLAILCERH TTSKLSAYED
101 LQTIKSMGFR GEALASMTYV GHVTVTTITE GQLHGYRVSY RDGVMENEPK
151 PCAAVKGTQV MVENLFYNMV ARKKTLQNSN DDYPKIVDFI SRFAVHHINV
201 TFSCRKHGAN RADVHSASTS SRLDAIRSVY GASVVRDLIE IKVSYEDAAD
251 SIFKMDGYIS NANYVAKKIT MILFINDRLV DCTALKRAIE FVYSATLPQA
301 SKPFIYMSIH LPSEHVDVNI HPTKKEVSLL NQERIIETIR NAIEEKLMNS
351 NTTRIFQTQA LNLSGIAQAN PQKDKVSEAS MGSGTKSQKI PVSQMVRTDP
401 RNPSGRLHTY WHGQSSNLEK KFDLVSVRNV VRSRRNQKDA GDLSSRHELL
451 VEIDSSFHPG LLDIVKNCTY VGLADEAFAL IQHNTRLYLV NVVNISKELM
501 YQQALCRFGN FNAIQLSEPA PLQELLVMAL KDDELMSDEK DDEKLEIAEV
551 NTEILKENAE MINEYFSIHI DQDGKLTRLP VVLDQYTPDM DRLPEFVLAL
601 GNDVTWDDEK ECFRTVASAV GNFYALHPPI LPNPSGNGIH LYKKNRDSMA
651 DEHAENDLIS DENDVDQELL AEAEAAWAQR EWTIQHVLFP SMRLFLKPPK
701 SMATDGTFVQ VASLEKLYKI FERC*

The mutL/PMS1 signature sequence is shown in bold.

Amino Acid Sequence Comparison of Rice and Arabidopsis mutL Homologs



2 DEPSPRGGGCAGEPPRIRRLEESVVNRIAAGEVIQRPSSAVKELIENSLD 51

n 0 1 1 1 1 I M : I . I I I I I I I I I I f I I I I I I I I ||||||:||||| 

13 EEESPATTIVPREPPKIQRLEESVVNRIAAGEVIQRPVSAVKELVENSLD 62



52 AGASSVSVAVKDGGLKLIQVSDDGHGIRFEDLAILCERHTTSKLSAYEDL 101
„ 1 * 11:11 HI HI HI HI Ml I II I III I II III Mill. :||i
63 ADSSSISVVVKDGGLKLIQVSDDGHGIRREDLPILCERHTTSKLTKFEDL 112

102 QTIKSMGFRGEALASMTYVGHVTVTTITEGQLHGYRVSYRDGVMENEPKP 151
■ : I I I I I I I I I I I I I I I I I I I I I I I . M : | | | | | | | | | | | | | in

113 FSLSSMGFRGEALASMTYVAHVTVTTITKGQIHGYRVSYRDGVMEHEPKA 162

152 CAAVKGTQVMVENLFYNMVARKKTLQNSNDDYPKIVDFISRFAVHHINVT 201 
M II III 1:1 II I III II: l|:||| HI | | |. |||| : |, 

163 CAAVKGTQIMVENLFYNMIARRKTLQNSADDYGKIVDLLSRMAIHYNNVS 212 

202 FSCRKHGANRADVHSASTSSRLDAIRSVYGASVVRDLIEIKVSYEDAADS 251
. I I II III I :||||| • Mil. I Mill II :. I..:. || |.. 

213 FSCRKHGAVKADVHSVVSPSRLDSIRSVYGVSVAKNLMKVEVSSCDSSGC 262

252 IFKMDGYISNANYVAKKITMILFINDRLVDCTALKRAIEFVYSATLPQAS 301

oco 1 1 = 1' I M.I I II I I :: MM II M:|.|| | || || ||. II I 1. II 

263 TFDMEGFISNSNYVAKKTILVLFINDRLVECSALKRAIEIVYAATLPKAS 312

302 KPFIYMSIHLPSEHVDVNIHPTKKEVSLLNQERIIETIRNAIEEKLMNSN 351 

1 1 1:1 1 'I IIM:II IIMII III Mil III I.. :| || I.I 
313 KPFVYMSINLPREHVDINIHPTKKEVSLLNQEIIIEMIQSEVEVKLRNAN 362 

352 TTRIFQTQALNLSGIAQANPQKDKVSEASMGSGTKSQKIPVSQMVRTDPR 401
363 DTRTFQEQKVEYIQ.STLTSQKSDSPVSQKPSGQKTQKVPVNKMVRTDSS 411

402 NPSGRLHTYWHGQSSNLEKKFDLVS.VRNVVRSRRNQKDAGDLSSRHELL 450

" 1 ' 1 1 " : • I • I I I . I I | | | | : | | | I ii. 

412 DPAGRLHAFLQPKPQSLPDKVSSLSVVRSSVRQRRNPKETADLSSVQELI 461

451 VEIDSSFHPGLLDIVKNCTYVGLADEAFALIQHNTRLYLVNVVNISKELM 500

: II I I I : I : |:||MII:||-: I I I : I : I I III I I II : I II I I 
462 AGVDSCCHPGMLETVRNCTYVGMADDVFALVQYNTHLYLANVVNLSKELM 511

501 YQQALCRFGNFNAIQLSEPAPLQELLVMALKDDEL..MSDEKDDEKLEIA 548

C10 111 1 I' -1111111:1111 ll:.:|||:::| .| ||| | n 

512 YQQTLRRFAHFNAIQLSDPAPLSELILLALKEEDLDPGNDTKDDLKERIA 561

549 EVNTEILKENAEMINEYFSIHIDQDGKLTRLPVVLDQYTPDMDRLPEFVL 598
'•'IIMII III: lllhlll I • I I I I : I I I I I I I I I I I I I I 
562 EMNTELLKEKAEMLEEYFSVHIDSSANLSRLPVILDQYTPDMDRVPEFLL 611

599 ALGNDVTWDDEKECFRTVASAVGNFYALHPPILPNPSGNGIHLYKKNRDS 648

Mill 1:111 II. I. - I: Ml I 1:1 I !:| I I I I |. | | | | :| 
612 CLGNDVEWEDEKSCFQGVSAAIGNFYAMHPPLLPNPSGDGIQFYSKRGES 661 



649 MADEHAENDLISDENDVDQELLAEAEAAWAQREWTIQHVLFPSMRLFLKP 698

: < : I ... I I : II .: I I I I I I I I I . I I I I I I I I I | | | | M 

662 SQEKSDLEGNVDMEDNLDQDLLSDAENAWAQREWSIQHVLFPSMRLFLKP 711

699 PKSMATDGTFVQVASLEKLYKIFERC 724 

I I I I I I I I . I I I I II I II I I I I I 
712 PASMASNGTFVKVASLEKLYKIFERC 737 



Deduced amino acid sequences of Oryza sativa and Arabidopsis thaliana (Genbank ID, 
SP_PL:Q9ZRV4) were compared using the Bestfit program of GCG. 
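
As a rough illustration of how such a comparison can be summarized, the sketch below counts identical positions in the final aligned block (rice residues 699-724 against Arabidopsis residues 712-737). The function name `percent_identity` is hypothetical, and the calculation is a simple column-by-column count over an existing alignment. It is not the Bestfit algorithm itself, and it ignores the similarity scores that Bestfit reports.

```python
def percent_identity(a: str, b: str, gap: str = ".") -> float:
    """Percent of aligned columns, excluding gap columns, with identical residues."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither sequence has a gap character.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if gap not in (x, y)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


rice = "PKSMATDGTFVQVASLEKLYKIFERC"         # rice residues 699-724
arabidopsis = "PASMASNGTFVKVASLEKLYKIFERC"  # Arabidopsis residues 712-737
print(f"{percent_identity(rice, arabidopsis):.1f}% identity")  # 84.6% identity
```

Twenty-two of the twenty-six positions in this block are identical. The same function could be applied to the other aligned blocks once their gap columns are marked with `.`.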

Comparison of cDNA sequences of MLH1 orthologs from A. thaliana and



158 GGGAGCCGCCCCGCATCCGGAGGTTGGAGGAGTCGGTGGTGAACCGCATC 207 

I Mill M III I II II II || || || Mill III 

73 GAGAGCCACCGAAGATTCAACGCTTAGAAGAATCAGTAGTCAACCGTATC 122 
208 GCGGCGGGGGAGGTGATCCAGCGGCCGTCGTCGGCGGTGAAGGAGCTCAT 257

INI II II II I MM Mi II || || Mm Ml Ml ! 

123 GCAGCTGGTGAAGTAATCCAGCGTCCAGTTTCAGCTGTGAAAGAGCTCGT 172



258 CGAGAACAGCCTCGACGCTGGCGCCTCCAGCGTCTCCGTTGCGGTGAAGG 307

1 1 1 M II II II II M II I | || || | | 1 1 1 1 | 

173 TGAGAACAGCCTCGACGCCGATTCAAGTTCCATAAGCGTCGTTGTCAAAG 222 

308 ACGGTGGCCTCAAGCTCATCCAGGTCTCCGATGACGGCCATGGCATCAGG 357 

I M I I M I M Mill II I Mill II Mill II II II II 
223 ACGGTGGTTTGAAACTCATTCAAGTCTCCGACGACGGTCACGGTATTAGA 272 



358 TTTGAGGATTTGGCAATATTGTGCGAAAGGCATACTACCTCAAAGTTATC 407

Mill III I Ml I Mill II Mill II II III I I 

273 CGTGAAGACTTGCCGATACTATGCGAGAGACATACAACATCGAAGCTGAC 322 

408 TGCATACGAGGATCTGCAGACCATAAAATCGATGGGGTTCAGAGGGGAGG 457

I I MUM II I II II Mill II Mill MM 

323 TAAGTTTGAGGATTTGTTCTCTCTGAGTTCAATGGGATTTAGAGGAGAGG 372 
458 CTTTGGCTAGTATGACTTATGTTGGCCATGTTACCGTGACAACGATAACA 507 

I M II II MM III 1 1 Mill III M Ml (MM II II II 

373 CATTAGCTAGTATGACCTATGTTGCTCATGTTACAGTGACTACTATTACT 422 



508 GAAGGCCAATTGCACGGCTACAGGGTTTCTTACAGAGATGGTGTAATGGA 557

DIM 1 1 I II II II II II MM I III III Mill Mill 

423 AAAGGCCAGATTCATGGTTATAGAGTGTCTTATAGAGATGGTGTCATGGA 472 
558 GAATGAGCCTAAGCCTTGCGCTGCGGTGAAAGGAACTCAAGTCATGGTTG 607 

I Mil II III I II Mill II 1 1 1 1 1 1 II || I Mm I 

473 GCATGAACCAAAGGCGTGTGCTGCTGTCAAAGGAACACAGATAATGGTGG 522
608 AAAATCTATTTTACAACATGGTAGCCCGCAAGAAAACATTGCAGAACTCC 657 

I Ml I II Mill III I II II III Ml I II II II 

523 AGAATTTGTTCTACAATATGATTGCTAGAAGGAAGACACTTCAAAATTCT 572 
658 AATGATGACTACCCCAAGATCGTAGACTTCATCAGTCGGTTTGCAGTCCA 707 

IMIM III II Mill II II I M III I II Ml 

573 GCTGATGATTACGGGAAAATCGTGGATTTGCTGAGCCGGATGGGTATTCA 622



708 TCACATCAACGTTACCTTCTCTTGCAGAAAGCATGGAGCCAATAGAGCAG 757 

„ I HI M M I III Mill MM Ml II I III II II I 

623 TTACAATAATGTCAGCTTTTCTTGTCGAAAGCATGGAGCTGTTAAGGCTG 672 



758 ATGTTCATAGTGCAAGTACATCCTCAAGGTTAGATGCTATCAGGAGTGTC 807

M M I II I M I IMIM I III | || III Ml 

673 ATGTTCACTCAGTCGTGTCACCTTCAAGGCTTGATTCAATTAGGTCTGTA 722 
808 TATGGGGCTTCTGTCGTTCGTGATCTCATAGAAATAAAGGTTTCATATGA 857

Mill I II II I I I II II II I Mill I 

723 TATGGAGTATCAGTTGCAAAGAACTTGATGAAAGTAGAAGTTTCCTCCTG 772 
858 GGATGCTGCAGATTCAATCTTCAAGATGGATGGTTACATCTCAAATGCAA 907 

M i I Ml I II I Mill -Ml III II III ! I 

773 TGACTCCTCTGGTTGTACTTTTGATATGGAGGGTTTCATATCCAATTCTA 822 
908 ATTATGTGGCAAAGAAGATTACAATGATTCTTTTCATAAATGATAGGCTT 957 

I Mill II I II Ml I II I II I III II I II 1 1 II III I I 

823 ACTATGTTGCTAAGAAGACTATATTGGTGCTTTTCATTAATGATAGATTG 872 
958 GTAGACTGTACTGCTTTGAAAAGAGCTATTGAATTTGTGTACTCTGCAAC 1007 

II II II MM II I II Ml 1 1 1 1 1 1 II 1 1 1 1 II || II III 

873 GTGGAATGCTCTGCCTTAAAAAGAGCCATTGAAATTGTTTATGCTGCAAC 922 
1008 ATTGCCTCAAGCATCCAAACCTTTCATATACATGTCCATACATCTTCCAT 1057 

I Mill MM II I MM I III I II 1 1 1 1 1 1 II II I III 

923 ATTGCCAAAAGCATCAAAACCTTTTGTCTACATGTCAATCAATTTGCCAC 972 
1058 CAGAACACGTGGATGTTAATATACACCCAACCAAGAAAGAGGTTAGCCTT 1107

Mill II III I Mill I III MM 1 1 II 1 1 1 1 1 II I M I IM 

973 GGGAACATGTTGATATCAATATTCACCCAACAAAGAAAGAGGTTAGCCTT 1022 
1108 TTGAATCAAGAGCGTATTATTGAAACAATAAGAAATGCTATTGAGGAAAA 1157 

I II M M I Ml MM I III I MM I Ml 

1023 CTAAACCAGGAAATCATTATTGAGATGATACAGTCAGAGGTTGAAGTAAA 1072 



1158 ACTGATGAATTCTAATACAACCAGGATATTCCAAACTCAGGCATTAAACT 1207 

MUM III III II II II II III III II 

1073 ACTGAGAAACGCAAATGATACTAGGACGTTTCAAGAGCAGAAAGT . . . . . 1117 

1208 TATCAGGGATTGC. .TCAAGCT AACCCACAAAAG GATA 1243 

Mill MM II I I II Mill I 

1118 GGAATACATTC AATCTACGTTAACATCTCAGAAAAGTGATTCTC 1161

1244 AGGTTTCTGAGGCCAGTATGGGTTCTGGAACAAAATCTCAAAAAATTCCT 1293

II MM II II MM III III I II III Mill 

1162 CAGTTTCTCAG AAGCCTTCTGGACAAAAGACACAGAAAGTTCCT 1205 


1294 GTGAGCCAAATGGTCAGAACAGATCCACGCAATCCATCTGGAAGATTGCA 1343 

M II I II I I II I II I I I I I II II I II II II I II II I I II 

1206 GTGAACAAAATGGTGAGAACAGATTCATCAGATCCAGCTGGAAGGTTACA 1255 

1344 CACCTACTGGCACGGGCAATCTTCAAATCT . . . TGAAAAGAAATTTGATC 1390 

Ml I III II II Ml III III III 

1256 TGCCTTTTTGCAACCCAAGCCACAAAGTCTCCCTGACAAGGTTTCTAGTT 1305 

1391 TTGTATCTGTAAGAAATGTTGTAAGATCAAGGAGAAACCAAAAAGATGCT 1440

I Mill I llllll III lllllll III II II 

1306 TGAGTGTAGTAAGGTCTTCTGTAAGGCAAAGAAGAAACCCAAAGGAAACT 1355

1441 GGTGATTTGTCAAGCCGTCATGAGCTCCTTGTGGAAATAGATTCTAGCTT 1490

I Mil l II II II II II III I I I II Ml 

1356 GCTGATCTTTCTAGTGTCCAGGAACTTATTGCTGGAGTTGACAGCTGCTG 1405
1491 CCATCCTGGCCTTTTGGACATTGTCAAGAACTGCACATATGTTGGACTTG 1540

INI II N I Mill III I III I Mini IN 1 1 III I I 

1406 CCATCCAGGTATGCTGGAGACTGTAAGGAATTGCACATATGTTGGAATGG 1455
1541 CCGATGAAGCCTTTGCTTTGATACAACACAATACCCGCTTATACCTTGTA 1590 

MUM I II III II I III I II 1 1 II MM M I I 

1456 CAGATGATGTTTTTGCTTTAGTTCAGTATAACACCCATCTATATCTAGCA 1505 
1591 AATGTGGTAAATATTAGTAAAGAACTTATGTACCAGCAAGCTTTGTGCCG 1640

' Ml Mill III I II MIM II Mill MINI M I III 

1506 AATGTGGTGAATCTCAGCAAAGAGCTAATGTATCAGCAAACTCTTCGTCG 1555
1641 TTTTGGGAACTTCAATGCTATTCAGCTCAGTGAACCAGCTCCACTTCAGG 1690 

, .Mill I II II II II Mill II II Mill || | | 

1556 TTTTGCTCATTTTAACGCAATACAGCTTAGCGATCCAGCCCCTTTGTCAG 1605 
1691 AGTTGCTGGTGATGGCACTGAAAGACGATGA . ATTGAT GAGTGAT 1734 

Mill I II MM II MM II II II I IN Mill 

1606 AGTTGATATTGTTGGCTCTGAAAGAGGAGGATCTAGATCCAGGAAATGAT 1655 
1735 GAAAAGGATGATGAGAAACTGGAGATTGCAGAAGTAAACACTGAGATACT 1784 

Ml MINI II II Mill Ml in || || I || 

1656 ACAAAAGATGATCTGAAAGAAAGAATTGCTGAAATGAATACAGAACTCCT 1705 
1785 AAAAGAAAATGCTGAGATGATTAATGAGTACTTTTCTATTCACATTGATC 1834 

M Mill II II III I I MIM M I MINIM 

1706 CAAGGAAAAAGCAGAAATGTTAGAGGAGTATTTCAGCGTGCACATTGACT 1755 
1835 AAGATGGCAAATTGACAAGACTTCCTGTTGTACTGGACCAGTACACCCCT 1884 

M M Ml MM IMMIII MM ; 1 1 ! 1 1 1 1 || ||| 

1756 CCAGTGCAAATTTGTCAAGGCTTCCTGTGATACTCGACCAGTATACACCT 1805 
1885 GATATGGACCGTCTTCCAGAATTTGTGTTGGCTTTAGGAAATGATGTTAC 1934

M Mill III MM Mill I I I M II! II Ml 

1806 GACATGGATCGTGTTCCTGAATTTTTACTATGCTTGGGAAATGATGTTGA 1855 
1935 TTGGGATGACGAGAAAGAGTGCTTCAGAACAGTAGCTTCTGCTGTAGGAA 1984 

MMI II Mill Mill I III II I III I II I 

1856 GTGGGAAGATGAGAAGAGTTGCTTTCAAGGAGTTTCTGCAGCTATTGGGA 1905 
1985 ACTTCTATGCACTTCATCCCCCAATCCTTCCAAATCCATCTGGGAATGGC 2034

lon MM II II I || | I Mill Mill II I II 

1906 ACTTTTACGCCATGCATCCTCCTCTTTTGCCAAACCCATCGGGTGACGGT 1955 


2035 ATTCATTTATACA AGAAAAATAGAGATTC 2063 

Hill II M I II I I II Mill 

1956 ATTCAGTTCTATAGTAAGAGAGGTGAGAGCTCTCAGGAAAAGTCAGATTT 2005 

2064 AATGGCTGATGAACATGCTGAGAATGATCTAATATCAGATGAAAATGACG 2113 

I I I I I I II III I II 
2006 AGAGGGTAACGTCGATATGGAGGACAATC . 2034 



2114 TTGATCAAGAACTTCTTGCGGAAGCAGAAGCAGCATGGGCCCAACGTGAG 2163

on ■ UN Mill Mill III || Ml || mill in in | | 

2035 TTGACCAAGATCTTCTGTCAGATGCTGAAAACGCATGGGCACAACGTGAA 2084 



2164 TGGACCATTCAGCATGTCTTGTTTCCATCCATGCGACTTTTCCTCAAGCC 2213 

ono mi i M ii ii ii in i in i M in ii i in i inn 

2085 TGGTCAATCC^C^CGTGTTGTTTCCGTCAATGAGATTGTTCTTGAAGCC 2134
2214 CCCGAAGTCAATGGCAACAGATGGAACGTTTGTGCAGGTTGCTTCCTTGG 2263 

91 „ 'I 'I MM I II 1 1 1 1 II HIM II 1 1 || IN | | 

2135 ACCAGCTTCCATGGCTTCAAATGGGACTTTTGTAAAGGTAGCATCCCTTG 2184 

2264 AGAAACTCTACAAGATTTTTGAAAGGTGTTAGCTCATA 2301 

I M II I I II I III II III | || || || | | 
2185 AAAAGCTGTACAAGATATTCGAACGATGCTAACTGAAA 2222 
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